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Abstract 

When simulating molecular systems using determin- 
istic equations of motion {e.g., Newtonian dynamics), 
such equations are generally numerically integrated ac- 
cording to a well-developed set of algorithms that share 
commonly agreed-upon desirable properties. However, 
for stochastic equations of motion {e.g., Langevin dy- 
namics), there is still broad disagreement over which 
integration algorithms are most appropriate. While 
multiple desiderata have been proposed throughout the 
literature, consensus on which criteria are important 
is absent, and no published integration scheme satis- 
fies all desiderata simultaneously. Additional nontriv- 
ial complications stem from simulating systems driven 
out of equilibrium using existing stochastic integra- 
tion schemes in conjunction with recently-developed 
nonequilibrium fluctuation theorems. Here, we exam- 
ine a family of discrete time integration schemes for 
Langevin dynamics, assessing how each member satis- 
fies a variety of desiderata that have been enumerated 
in prior efforts to construct suitable Langevin integra- 
tors. We show that the incorporation of a novel time 
step rescaling in the deterministic updates of position 
and velocity can correct a number of dynamical de- 
fects in these integrators. Finally, we identify a particu- 
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lar splitting that has essentially universally appropriate 
properties for the simulation of Langevin dynamics for 
molecular systems in equilibrium, nonequilibrium, and 
path sampling contexts. 

1 Introduction 

Simulating the dynamics of molecular systems on a 
digital computer requires that the equations of motion 
be discretized. The resulting discrete-time integration 
algorithm that governs the updates of particle positions 
and velocities will necessarily have properties that dif- 
fer from the continuous equations of motion on which 
it was based. To construct such an algorithm, one 
must decide which properties of the original dynamics 
should be preserved. Even then, a multitude of inte- 
gration schemes may satisfy these properties and still 
recover the continuous stochastic equations of motion 
in the limit of an infinitesimally small time step. 

For integrating the deterministic classical equations 
of motion prescribed by Newtonian dynamics, explicit 
symplectic integration schemes such as velocity Ver- 
let are now widely regarded as being optimal for con- 
densed matter systems for a number of reasons: they 
are reversible, simple to implement, preserve phase 
space volume, require minimal force evaluations, and 
are generally stable over long integration times. 1-4 

For stochastic equations of motion — in which the in- 
fluence of some system components (often the solvent) 
are not represented explicitly, but instead by random 
collisions with fictitious particles — no such generally- 
adopted integrator yet exists. In particular, the dynam- 
ics produced by existing algorithms differ (in a time 
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step dependent manner) in important respects from the 
dynamics of the continuous equations of motion. Nev- 
ertheless, significant effort has been devoted to devel- 
oping such stochastic integrators due to their utility in 
simulating many systems of interest to the chemical, 
biophysical, and physical sciences. 

Driven nonequilibrium systems present their own set 
of special challenges. For example, a powerful set of 
nonequilibrium work fluctuation theorems 5 permit the 
computation of equilibrium properties of systems from 
their nonequilibrium statistics, but require as input the 
distribution of work values associated with the ensem- 
ble of trajectories. Calculations that use naive analo- 
gies of the work for continuous dynamics — failing to 
take into account details of the discrete integration 
scheme — possess systematic biases. 6 

Here, we consider various choices that could be made 
in constructing discrete-time integration schemes for 
Langevin dynamics. By examining the various possible 
Strang splittings of the Langevin Liouville operator, in- 
corporating a novel time step rescaling [Eq. (15)], and 
comparing their resulting properties to several desider- 
ata that have been enumerated in the literature over 
recent decades, we show how it is possible to simul- 
taneously satisfy nearly all these criteria with a sin- 
gle integration scheme [Eq. (7)] that is generally appli- 
cable, simple to implement, computationally efficient, 
and produces thermodynamically-consistent nonequi- 
librium statistics. 

2 Theory 

2.1 The Langevin equation 

A standard framework for the stochastic simulation of 
molecular systems assumes that the variables of inter- 
est evolve according to Langevin dynamics with un- 
correlated Gaussian noise, 7 which represents interac- 
tions with the surrounding environment through fric- 
tional drag and stochastic fluctuations: 

dr = vdt (la) 

dv=—dt-Yvdt + J^-d\N(t). (lb) 
m V pm 

Here r and v are time-dependent position and velocity, 
m is mass, /3 = 1 /k#T, k# is Boltzmann's constant, T is 
the temperature of the environment, 7 is a friction coef- 
ficient (with dimensions of inverse time), and W(?) is 
a standard Wiener process. The force f(t) is due to the 
(in general time-dependent) Hamiltonian 3tf(t) on the 



system with position r{t), as determined by the deriva- 
tive of the potential energy, —dJif(t)/dr, evaluated at 
r(t). For multi-dimensional, multi-particle systems, r, 
v, /, and d\N are vectors, and mis a diagonal matrix 
(see Appendix "Multiple dimensions.") 

While these equations of motion can be solved ex- 
actly for some simple systems, nearly all complex sys- 
tems of interest require computational techniques in or- 
der to generate dynamical trajectories. On digital com- 
puters, this requires discretization of time. The selec- 
tion of an appropriate discrete time integration scheme 
is made difficult by the fact that many discretizations 
may exist that recover the same continuous stochastic 
differential equations of motion in the limit of an in- 
finitesimally small time step, but these schemes may 
possess very different properties for finite time steps. 

2.2 Desiderata 

Finite time step integrators for molecular systems can- 
not hope to reproduce dynamical trajectories of the 
continuous equations of motion exactly. Imprecision 
of experimental measurements ensures that simulated 
initial conditions necessarily deviate from 'true' ones, 
and Lyapunov instability of even deterministic dynam- 
ics ensures the rapid chaotic growth of such deviations, 
making the exact reproduction of a particular trajectory 
impossible even were it desirable. 2 Moreover, artifacts 
are inevitably introduced when one discretizes continu- 
ous equations of motion in a straightforward manner: 
dynamical motion increasingly diverges from that of 
continuous equations of motion with increasing friction 
and/or time step. Instead, the goal is often to repro- 
duce certain statistical (often observable) properties of 
the system, especially in terms of correlation functions 
and (possibly time-dependent) ensemble expectations 
given a set of initial conditions. Thus, a desirable ap- 
proximation scheme should share certain statistical and 
dynamical properties of the ensemble of trajectories as- 
sociated with the exact equations of motion, in lieu of 
being able to exactly integrate trajectories. 

Pastor, et al. 8 proposed that a useful discrete-time in- 
tegrator should reproduce seven quantities associated 
with the continuous-time equations of motion: for a 
free particle (zero-force), the mean-squared displace- 
ment (MSD) as a function of time, the mean-squared 
velocity (MSV), and the velocity autocorrelation func- 
tion (VAC); for a uniform external force, the terminal 
velocity; and for a harmonic potential (linear force), 
the MSD, MSV, and the virial. In Table 1, we define 
these desired dynamical properties and list their ana- 
lytically computed values for the continuous equations 
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of motion. In the Appendix "Determination of rescal- 
ing parameters," we describe in detail the calculation of 
these quantities for our family of integrators. 

One should not expect a discrete algorithm to give 
meaningful results on timescales less than a single time 
step. For practical purposes, it should be sufficient for 
the practitioner to know at which specific point within 
a given time step to measure a given dynamical quan- 
tity to recover the continuous-limit value. Thus for pur- 
poses of this paper, we consider an integrator to satisfy 
a given dynamical property if at some point during each 
time step it maintains the value associated with the con- 
tinuous equations of motion, i.e., we consider the zero- 
force MSD to be satisfied if either (r 2 (n)} or (r 2 (n + ^)} 
equals -j^nAt independent of time step size. Here, 
r(n) denotes the position at the beginning of a time step 
update, while r(n+ \ ) denotes the position at an inter- 
mediate substep during the time step update procedure. 

There are several other criteria that one may want 
an integrator to satisfy, not the least being ease of im- 
plementation and analysis. Additionally, an integrator 
that is computationally efficient should have an accu- 
racy that scales reasonably with time step length (here, 
quadratically, the same accuracy order as popular sym- 
plectic integration schemes for deterministic dynam- 
ics), permitting relatively large time steps; minimize 
the number of force evaluations (one per time step) 
so as to minimize computational effort; easily incorpo- 
rate constraints (typically reflecting covalent chemical 
bonds to light elements such as hydrogen) that elim- 
inate the need for simulation of the fastest modes of 
the system. Path sampling 9 or path reweighting 10 ' 11 
strategies often require an integrator that induces an ir- 
reducible Markov chain {i.e., it is possible to transition 
from any phase space point to any other in a single time 
step through specific choice of the random variables 12 ) 
and that for a given trajectory has a readily evaluated 
path action that governs the probability of that trajec- 
tory within the ensemble. 

Indeed, the task is further complicated when the 
integrator must produce thermodynamically-consistent 
nonequilibrium simulations. To facilitate the use of 
nonequilibrium fluctuation relations 5 and estimators 
derived from them, the integrator must properly split 
dynamics into stochastic, explicit Hamiltonian-update, 
and deterministic substeps that distinguish between the 
heat, work, and shadow work. 13 For the calculated 
works to be thermodynamically meaningful, the inte- 
grator must also have a form symmetric under time- 
reversal. 

Pastor, et al. demonstrate that no members of their 
family of overdamped integrators can simultaneously 



satisfy more than four of their seven desired dynam- 
ical properties. Many other underdamped discretiza- 
tions of the Langevin equation have been proposed 14-24 
that achieve some subset of these enumerated desider- 
ata, yet there still is no widespread agreement on an 
integrator that performs satisfactorily for a broad set of 
purposes. 

2.3 Resolution - time step rescaling 

Drawing inspiration from several popular integrators, 
in this paper we derive a simple family of integrators 
that split the different update types to permit the defi- 
nition of thermodynamically meaningful quantities for 
work and heat. With a novel rescaling of the time 
step, the resulting dynamics preserves six of Pastor et 
al.'s seven dynamical properties for any values of fric- 
tion and time step (the exception being the linear-force 
virial), and furthermore satisfies all of the other desider- 
ata enumerated above, including its utility for nonequi- 
librium simulations schemes involving path sampling 
or reweighting. The resulting integrator is simple to 
implement and could be a general all-purpose replace- 
ment for the various discrete-time Langevin integrators 
now in use. 



3 Integrator Splitting 

We derive a family of integrators by splitting the 
time evolution operator into stochastic and determin- 
istic components 19 and choosing the adjustable param- 
eters to match dynamical quantities from the continu- 
ous equations of motion. We write the one-dimensional 
version here, but the generalization to multiple dimen- 
sions is straightforward (see the Appendix "Multiple 
dimensions"). The Langevin Liouville operator (some- 
times termed the Liouvillian) 25 can be naturally written 
as a sum of four parts ££ = Jz? + j£f v + ££ r + ££ h . The 
first operator represents stochastic thermalization 26 via 
an Ornstein-Uehlenbeck operator, 
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(2) 



the next two operators represent deterministic Newto- 
nian evolution of velocity and position, 



m dv ' 



(3) 



and the last operator represents the time evolution of 
the system Hamiltonian according to the predetermined 
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schedule (or protocol) A, 

e A, ^jr(n) = jr(n+\) , 



by: 



(4) 



where n denotes the time step index and t = nAt the 
simulation clock time for time step At. (Note that in 
the case of a time-independent Hamiltonian, e AtJ ^ h is 
the identity operator.) We approximate the dynamics 
over a time At by applying a series of Strang (symmet- 
ric Trotter) operator splittings, 27-29 



Q At[A+B+C+D] 

= ef A e *[B+c+D] e Y A + 0(At 3 ) 

At . At „ . _. At „ At . 

= e2 A e 2 B e ^ c+D ^ e2 B e T + 0(At 3 ) (5c) 



(5a) 
(5b) 



At . At „ At , 



At „ At „ At 



e i A e 2 B e2 c e AtD e2 c e 2 B e 2 A + 0(At 3 ), (5d) 
where (A,B,C,D) represents a permutation of 

There are six Strang splittings of the Liouville opera- 
tors J^,«S? r ,«Sf v . The Hamiltonian update operator Jzf/j 
commutes with Jzf and with Jz? r , so for each of these 
six splittings there are only two unique placements of 
Jz?/,. For each of the six splittings, one placement of Jzf/, 
interleaves the position, Hamiltonian and deterministic 
velocity updates in such a way as to require multiple 
force evaluations per step, making the scheme compu- 
tationally inefficient. Thus there are six distinct split- 
tings that each give rise to different finite time step dy- 
namics and require only one force evaluation per step. 
Notably, because the error in each Strang splitting is 
&{At 3 ), all are identical to the true Liouville operator 
Jz^in the limit Af->0+. 

One such splitting is a stochastic generalization of 
Velocity Verlet that we call OVRVO (to denote the re- 
spective ordering of Ornstein-Uehlenbeck (O), deter- 
ministic velocity (V), and deterministic position (R) up- 
dates (in analogy to the nomenclature of Leimkuhler 
and Matthews 24 ), 



(6) 



At Al „ Af „ At Af „ A/ „ 

e 2 ° e 2 e 2 e oa - c h e 2 r e 2 v e 2 

a b c d e f g 

The Hamiltonian-update step is placed to minimize the 
number of force evaluations. 

For this operator splitting, a single update step that 
advances the simulation clock by At is given explicitly 
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v{n + \) = ^v{n) + x — ^Y + (n) (7a) 
y pm 



, In / In b At f(n) 

v( n + I)=v(« + i) + — J -j^ 



r{n+\) = r{n) + — v{n+\) 



bAt 

Jf?(n) -> J4?(n + 1) 

r{n+\) = r{n+\) + —v(n + \) 
, , , , bAt f(n+l) 



(7b) 

(7c) 
(7d) 
(7e) 

(7f) 



l-a 



v(n + l)=Vav(n+l) + J^^-(n+l). (7g) 

V P m 



Here, a = exp(— jAt), and J/ + and ,jV~ are inde- 
pendent, normally distributed random variables with 
zero mean and unit variance (hence, when scaled by 
(j3m) -1 / 2 , distributed according to the equilibrium 
Maxwell-Boltzmann velocity distribution). 

The substeps in Eq. (7) are the finite difference ex- 
pressions of the corresponding suboperators in Eq. (6). 
The initial (a) and final (g) operators randomize the 
velocity and leave the position unchanged, mixing 
the old velocity with a Maxwell-Boltzmann random 
variate (with old velocity weighted according to a = 
exp[— yAt\). These operators can be analytically inte- 
grated to give the first [Eq. (7a)] and last [Eq. (7g)] 
substeps that are stochastic, Markovian, and detailed- 
balanced (with respect to the true canonical mea- 
sure 30 ). 19,31 The operators (b) and (f) correspond to 
deterministic velocity updates, while (c) and (e) corre- 
spond to deterministic position updates. Together they 
are approximated by the finite difference expressions 
of substeps [Eq. (7b),Eq. (7c),Eq. (7e),Eq. (7f)], which 
together constitute the deterministic and symplectic ve- 
locity Verlet integrator, 28 32 but slightly altered by an 
effective time step rescaling < b < 1 , chosen to main- 
tain the continuous-limit zero-force diffusion coeffi- 
cient and terminal drift under a uniform external force, 
regardless of friction coefficient or time step (derived 
in the Appendix "Determination of rescaling parame- 
ters")- Operator (d) and its finite difference expression 
as Eq. (7d) makes explicit the midpoint Hamiltonian 
update. Note that several other popular integrators do 
not explicitly include the update of the system Hamilto- 
nian, presumably because they are not concerned with 
calculating distributions of the work generated by ex- 
plicit Hamiltonian changes. 

The alternative splittings with one force evaluation 
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per time step include ORVRO (a stochastic generaliza- 
tion of position Verlet 28 ), 



(8) 



At „ At „ At „ At ^ At „ At „ 

e 2 °g 2 r g 2 h e^ v e 2 *g 2 g 2 • 



RVOVR (an explicit Hamiltonian-update generaliza- 
tion of Leimkuhler and Matthew's 'ABOBA'), 



„Af[j%+J%+J&fr+-S%] 



At ™ At „ At „ At „, At „ At ™ 

g 2 r g 2 A g 2 v e e 2 v e 2 *g 2 r 



(9) 



VRORV (an explicit Hamiltonian-update generaliza- 
tion of Leimkuhler and Matthew's 'BAOAB'), 



(10) 



At „, At „ At ™ . „ At „ At „ At „ 

e 2 v e 2 e 2 • Ch e cst - C °e 2 h e 2 e 2 v • 



ROVOR, 



(11) 



At „ At „ At „ At At „ At^, 

e 2 2 2 v e 2 *e 2 2 



and VOROV, 



(12) 



At ™ At „ At „ . v At^ At ^ At „ 

e 2 v e 2 °e 2 r e h e 2 r e 2 2 -* v 



Since for a single time step At the error is 0(A? 3 ) 
for any of these Strang splittings, when applied over 
N = t /At time steps the global error is 0(At 2 ). Figure 1 
confirms that OVRVO errors in the energy are second 
order in the time step At. 

4 Time step rescaling recovers 
dynamical properties 

Standard integrators implicitly set the parameter b to 
unity in Eqs. (7b), (7c), (7e) and (7f). However, we 
show later that a non-unit b recovers, for arbitrarily- 
large time step, the continuous-limit values of impor- 
tant dynamical quantities. The time step rescaling can 
be most simply derived by noting that in the absence 
of a potential, for any of the six splittings, a trajec- 
tory is isomorphic to a semi-flexible Gaussian polymer 
chain: 33 the set of positions correspond to the beads on 
the polymer chain, and the displacement vectors during 
single time steps correspond to the inter- monomer bond 
vectors in the chain. In one dimension, the displace- 
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Figure 1: Numerical demonstration that the 
errors in energy of the OVRVO integrator 
[Eq. (7)] are second order in At. Here, we use 
a previously described model system 18 of a har- 
monic potential, with unit spring constant, fric- 
tion coefficient, temperature, and mass, with initial 
conditions r(0) = v(0) = 0. The error is the abso- 
lute deviation of the estimate of (r 2 (l) + v 2 (l)) = 
0.97961 1 1900 . . . (twice the energy) computed by 
ensemble averaging over 10 8 independent realiza- 
tions. The line is the graph of the function At 2 . 



ment is a normally distributed random variable with 
zero mean and variance Gq = (bAt) 2 / (j8m) (in accor- 
dance with Maxwell-Boltzmann velocity statistics and 
the time interval b At), and the autocorrelation between 
velocities separated by N steps decays exponentially as 



[v(N) v(0)> 



JV 



j3m 



(13) 



For this system the mean square displacement in the 
large time (yt S> 1) limit is 33 



[r(AO-r(0)] 2 > 



No%- (14a) 

I— a 

w <M0!co,h^. (14b) 
pm 2 



The time step rescaling b results from equating this ex- 
pression to the mean-squared displacement of a freely 
diffusing particle in one dimension, 2Dt = 2t/(pmy), 
for a total simulation time over N steps, t = N At. In 
particular, the time step used in the position update step 
is rescaled by the factor 



' 2 yAt 

tanh - — 

yAt 2 



(15) 
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ensuring that the effective free-particle diffusion con- 
stant is independent of time step length (see Figure 2). 
In the low friction limit yAt < 1, b = 1 - 0([yAt] 2 ), 
and in the high friction limit y At » 1, b = yjl / (yAt). 
Note that even though the position update utilizes an 
effective time step of bAt, the simulation clock is still 
advanced by the full time step At. We derive the time 
step rescaling from a different perspective and in more 
detail in the Appendix "Determination of rescaling pa- 
rameters." 
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Figure 2: Time step rescaling recovers correct 
field-free diffusion as a function of time step. 

Root mean-squared displacement versus relative 
time step length At at time t = 64 for a freely dif- 
fusing particle in one dimension, with unit mass, 
temperature, and friction coefficient, subject to 
the OVRVO integrator [Eq. (7)] without time step 
rescaling, b = 1 (o), or with time step rescal- 
ing, Eq. (15) (x). 



Euler-Maruyama method ), 

. . . At fin) / 2At . 

y m y pmy 

ROVOR and VOROV interpose a velocity randomiza- 
tion substep between the deterministic velocity and po- 
sition updates. In this yAt 3> 1 limit, the velocity 
is completely randomized before each position-update 
substep, and thus the position updates are completely 
independent of the Hamiltonian. The other four split- 
tings preserve the influence of the Hamiltonian on the 
dynamics even in this limit of large friction (or large 
time step). 

OVRVO also reduces to several other popular inte- 
grators in other limits or approximations. If the effec- 
tive time step rescaling for the deterministic substeps 
is omitted, such that b = 1, then OVRVO is equiva- 
lent to an integrator described by Adjanor, Athenes, 
and Calvo; 17 and by Bussi and Parrinello. 19 If we also 
combine all stochastic and deterministic velocity up- 
dates [Eq. (If), Eq. (7g), Eq. (7a), Eq. (7b)], we recover 
the integrator of Athenes; 15 and recasting the Athenes 
integrator as a Verlet-style integrator (only monitor- 
ing position) we converge with the Brunger-Brooks- 
Karplus (BBK) integrator 14 in the low friction limit. 
If instead we only combine adjacent pairs of stochas- 
tic and deterministic velocity updates [Eq. (7a) with 
Eq. (7b), Eq. (7f) with Eq. (7g)] (still with no time 
step rescaling) we produce the low friction limit of the 
Langevin Leapfrog integrator of Izaguirre, Sweet, and 
Pande. 1021 



5 Integrator properties 

5.1 OVRVO generalizes other popu- 
lar integrators 

For an explicitly time-independent system Hamilto- 
nian, this family of integrators reduces to various 
other schemes in certain limits or approximations. At 
zero friction, y = and a = b = 1 , thus the stochas- 
tic substeps have no effect, so OVRVO, VRORV, 
and VOROV are identical to the deterministic veloc- 
ity Verlet integrator, whereas ORVRO, RVOVR, and 
ROVOR are identical to position Verlet. In the high- 
friction or long-time limit, a = and b = ^Jl / (yAt), 
and OVRVO reduces to the Euler integrator for over- 
damped Langevin dynamics 34 (also known as the 



5.2 Nonequilibrium work 

There is significant interest in probing the probability 
distribution of work required during a nonequilibrium 
driving process, which via the work fluctuation theo- 
rems 5 can report on various system properties. Such 
usage requires careful splitting of thermodynamically 
distinct energy changes. 6 

The total energy change AE during the nth full time 
step of OVRVO can be cleanly separated into heat Q, 
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protocol work W prot , and shadow work W s h a d- 

AE = Q + W (17a) 

= e+w prot +w shad (nb) 

Q=*r(y(n+\))-*r(y(n)) (17c) 
+ ^(v(n + l))-^(v(n+|)) 
W prot = ty(r(n+ i),n+ l) -W(r(n+ ±),n) (17d) 
W shad = ^(v(« + i))-^(v(« + i)) (17e) 
+ W(r(n + ±),n))-W(r(n),n) 
+ W(r(n + l),n+l)-W(r(n + ^),n+l) 
+ ^(v(n+l))-^(v(n + D). 

Here, ^ (r,n) is the potential energy for configuration 
r under Hamiltonian J^{n), and £?{y) = ^mv 2 is the 
kinetic energy for velocity v. The five other splittings 
permit similar decompositions of energy changes into 
heat, protocol work, and shadow work. 

5.3 Constraints 

Constraints, such as rigid bond lengths, can be read- 
ily incorporated into the dynamics using standard 
techniques. 23 The symplectic part of the integra- 
tor can be constrained with the standard RATTLE 
(Eq. (7b),Eq. (7f)) algorithm. 2 - 36 Since RATTLE is 
symplectic if iterated to convergence, 37 adding con- 
straints does not interfere with the underlying re- 
versibility of the dynamics. Similarly, the velocity 
randomization substeps, Eq. (7a) and Eq. (7g), can be 
constrained with RATTLE, which modifies the heat 
flow, but preserves detailed balance. 23 Consequently, 
constrained versions of this family of integrators still 
obeys the precepts of nonequilibrium thermodynam- 
ics, 6 with the same definitions of heat, protocol work 
and shadow work [Eq. (17)], provided that the defini- 
tion of free energy is altered to account for the con- 
strained degrees of freedom. 23 

5.4 Computational efficiency 

All six splittings require one force evaluation per time 
step. For OVRVO, for example, the force in Eq. (7f) 
is identical to the force in Eq. (7b) of the next time 
interval. Measuring heat requires two evaluations of 
kinetic energy per time step for all six splittings, for 
OVRVO just after Eq. (7a) and just before Eq. (7g). 
Separately measuring protocol work and shadow work 
requires two potential energy evaluations per time step, 
once each just before and after the Hamiltonian- update 
substep. Shadow work measurement also requires the 



kinetic energy evaluations already needed to measure 
heat, as shadow work and heat are the only processes 
that change the kinetic energy. 

However, if only the total thermodynamic work W = 
AE — Q is of interest, this can be calculated given the 
heat Q(") during each step n — > n + 1 [Eq. (17c)] (easily 
accumulated during integration) and the total energy at 
the beginning and end of the simulation, 

W = <% (r (AO , N) + ST (v (AO ) 
-<^(r(0),0)-^(v(0)) 

N-l 

- £ Q in) ■ (18) 

n=0 

The OVRVO integrator requires two normal random 
numbers per velocity per time step, one each for the ini- 
tial [Eq. (7a)] and final [Eq. (7g)] velocity randomiza- 
tions. Splitting the velocity randomization across time 
steps ensures that the dynamics is microscopically re- 
versible and Markovian, and that the induced Markov 
chain is irreducible. (The first stochastic substep gen- 
erates the velocity needed to move to the final position, 
and the last stochastic substep generates the final ve- 
locity.) ORVRO and VOROV also induce irreducible 
Markov chains. RVOVR, VRORV, and ROVOR effec- 
tively agglomerate the two velocity randomizations into 
a single randomization involving one random number, 
and hence do not generate irreducible Markov chains. 
Irreducibility has utility for path sampling 9,38 ^ and 
path reweighting 10,11 schemes, since any proposed dis- 
crete time trajectory through phase space is a valid tra- 
jectory of an irreducible Markov chain. However, many 
practical applications do not require irreducibility, so 
one can halve the number of required random vari- 
ables for OVRVO and ORVRO by combining the last 
stochastic substep of one full step with the first stochas- 
tic substep of the next full step, and for RVOVR and 
VRORV by combining the two stochastic substeps of 
a given full step. ROVOR and VOROV separate their 
stochastic substeps such that they cannot be easily com- 
bined. Leimkuhler and Matthews show that in the high 
friction limit and at medium time step VRORV with this 
single velocity randomization (and time-independent 
Hamiltonian) is second-order accurate when other in- 
tegrators become first-order. 24 

When only the total thermodynamic work is of inter- 
est, we can combine the last two velocity updates of 
Eq. (7) with the first two updates from the next step, 
and combine the two position updates, to give a three 



8 



substep stochastic Leapfrog integrator: 



v(n + i) =av(n— + . 



1-a 2 
/3m 



^(n) (19a) 



+ (!+«) 



/(«) 



r(n+l) = r(ra)+6A?v(« + 5) 
J^(n)-> Jf(n + 1) . 



2 m 



(19b) 
(19c) 



Under these circumstances, RVOVR, ROVOR, and 
VOROV reduce to similar three substep integrators, 
but, due to their sequencing of substeps, ORVRO and 
VRORV each only reduce to a five substep integrator. 

5.5 Path action 

The path action S^\X\ is a necessary quantity for many 
path sampling 9,38 ^ and path reweighting 1011 tech- 
niques. The conditional path probability functional is 
a product of single time step probabilities, 



P[X|*(0),A] = e -^™»A] 

N-l 

Y[P[x(n + l)\x(n)] 



(20a) 
(20b) 



n=Q 



Here, X is a trajectory through phase space between 
jc(0) = {r(0),v(0)} at time and x{NAt) at NAt. Each 
time step probability is determined by the probability 
of the requisite random variables, which for OVRVO is 



P[jc(n+l)|jc(n)] 
f3m 



bAt(\-a) 



(21) 

p(<yf + (n)) p{yY-{n + \)) . 



The first factor f}m/\bAt{\ — a)] is the Jacobian for 
the change of variables from {r(n + l),v(n + 1)} to 
{^V + (n),^V^(n + 1)}, and the probabilities are nor- 
mal with zero mean and unit variance, 



1 



p{^V ± ) = -^=exp 



2k 



1 



±\2 



(22) 



where 



jV + {n) = 
jy-{n+\) = 



[5m 
\-a 
f3m 
Ta 



[ v {n+\)-^v{n)] (23a) 



[v{n+\)-y/av(n+\)\ 



(23b) 



The intermediate velocities can be determined by the 
initial and final position, 



v(n+\\ 



v{n+l 



^Hn^-r^-fm (24a) 

1 r . . .-, bAt f(n+l) 

\r(n + l)-r(n)] H — . 

bAt 1 y 1 WJ 2 m 

(24b) 



Combining Eq. (20) and Eq. (24) gives the action as a 
function of position and velocity at the unit time steps, 



y[X\x(Q),K] =ln 



2%{\-a)bAt 



N 



S2(l-a) 



/3m 

r(n + \) - r(n) bAt f(n) 



bAt 



m 



(25) 
\fav{n 



r(n + \)-r(n) bAt f(n+\) 



bAt 



+ 



m 



-v(n+r 



The path probability obeys the expected symmetry un- 
der time-reversal, 6 where the work is defined as in 
(Eq. (17d),Eq. (17e)). 

VOROV has a similarly simple expression for the 
path action. The path action for ORVRO additionally 
requires the evaluation of the force and its derivative 
at the half-step position, r(n+\), hence requires paths 
of twice as many points, and therefore is of lesser util- 
ity. RVOVR, VRORV, and ROVOR induce reducible 
Markov chains and thus these splittings have infinite 
path actions for the vast preponderance of paths. 

6 Results 

All of our Strang splittings lead to seven-substep inte- 
gration schemes that are time-symmetric; are second- 
order accurate in At; make Hamiltonian changes ex- 
plicit; distinguish between heat, protocol work and 
shadow work; and easily incorporate constraints. Six of 
the twelve unique splittings require a single force eval- 
uation per time step and thus are computationally effi- 
cient. Setting a = e~ yAt and b = ^ tanh ^ gives for 
all six one-force-evaluation splittings the continuous- 
limit MSD, MSV, and VAC in the zero-force case, and 
the linear-force virial with asymptotic error 0(At 2 ). 
The six splittings differ in the remaining desiderata. We 
summarize their properties in Table 2. 
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7 Conclusions 

As a stochastic generalization of a standard determinis- 
tic technique, OVRVO implements what can be consid- 
ered a form of velocity Verlet with velocity randomiza- 
tion (VVVR). It is an integrator of general utility, sat- 
isfying six of Pastor et al. 's seven dynamical properties 
(still producing artifacts in the linear-force virial), as 
well as the remaining enumerated desiderata. It is well- 
suited for the study of nonequilibrium thermodynam- 
ics: work is easily measured because the Hamiltonian 
changes are made explicit, and these measured works 
are thermodynamically meaningful because OVRVO 
distinguishes between heat, protocol work, and shadow 
work. OVRVO's simple form for the path action facili- 
tates its use in trajectory reweighting or path sampling 
methods. Our novel time step rescaling maintains (for 
an arbitrary time step) various continuous-limit dynam- 
ical quantities, in particular the uniform-force terminal 
drift and linear- force fluctuations in position and veloc- 
ity. Thus, within stability limits, computations can be 
speeded while maintaining reasonable dynamics. Fi- 
nally, OVRVO generalizes several popular integration 
schemes and thus relates naturally to the existing lit- 
erature. Its extension to a multiple time step integra- 
tor 41 is straightforward, requiring only replacement of 
the inner symplectic step [Eq. (7b)-Eq. (7f)] with a cor- 
responding symplectic multiple time step integrator for 
deterministic dynamics. 

By contrast, alternative splittings suffer from short- 
comings of varying severity and number. ROVOR 
and VOROV produce uniform-force terminal drift and 
linear-force fluctuations of position and velocity that 
differ from continuous-limit values, lose Hamiltonian 
dependence in the limit of large yAl, and require 
two random numbers per time step even for reducible 
Markov chains. RVOVR and VRORV induce Markov 
chains that are not irreducible and thus these split- 
tings have limited utility for path-sampling schemes. 
ORVRO seems similarly useful to OVRVO, but in path 
sampling applications requires paths with twice the 
number of points as OVRVO. 
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A Determination of rescaling 
parameters 

Here we determine the coefficients a and b in the 
OVRVO equations [Eq. (7)] by requiring the exact sat- 
isfaction of six of the seven dynamical quantities pro- 
posed by Pastor, et al. 8 (the linear-force virial is not 
preserved). Except where noted, the same analysis 
produces equivalent results for the ORVRO, RVOVR, 
VRORV, VOROV and ROVOR splittings. 

A.l Zero-force mean-squared velocity 

The continuous-limit MSV is fixed by the scale of the 
random velocity fluctuations. In the zero-force case, 
OVRVO reduces to the following scheme 

v (n + \)=av{n-\) + ^^ i ,yV(n) (26a) 
r(n+\) =r(n)+bAtv(n+^) . (26b) 

Ensemble averaging after multiplying Eq. (26a) in turn 
by r(n) , r(n + 1 ) , v(n + \ ) produces 

(r{n)v{n + \))=a(r{n)v{n-\)) (27a) 

{v 2 {n+\))=a{v(n-) i )v{n+\))+ l j^- 

(27b) 

(v(n-\) v (n + \))=a{v 2 {n-\)) . (27c) 

Angled brackets denote an ensemble average. Combin- 
ing Eq. (27b) and Eq. (27c) and rearranging gives: 

<v>+J»-^ = -(<»->-l)>-^) • gs> 

So for a < 1 the MSV will asymptotically approach 
the Maxwell-Boltzmann result (/3m) _1 regardless of its 
initial value. 
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A.2 Zero-force velocity autocorrela- 
tion function 

Requiring the continuous-limit zero-force velocity au- 
tocorrelation function fixes the parameter a. For dis- 
crete Langevin dynamics, averaging after multiplying 
Eq. (26a) by v(n + \ — i) for t > produces 

(v(n + \ - i)v(n + I)) = a(v(n + \ - i)v(n — |)) . 

(29) 

Induction using this and the equilibrium result (v 2 (n — 
i)) = (/3m)- 1 reveals that 



(v(n-i)v(n -§+£)> 



/3m 



(30) 



Comparison with the zero-force velocity autocorrela- 
tion function for continuous-time Langevin dynam- 



ics 



42 



(v(O)v(O) 



-yt 



[5m 



(31) 



fixes a = e rAt . 



A.3 Zero-force diffusion coefficient 

Requiring the continuous-limit zero-force MSD fixes 
the time step rescaling b in Eq. (7c) and Eq. (7e). En- 
semble averaging after multiplying Eq. (26b) in turn by 
r(n),v(n+ ^),v(n — \) produces 

(r(n)r(n + 1)) = (r 2 (n)) + bAt(r(n)v(n + ±)> 

(32a) 



(r 2 (n+\)) = (r(n)r(n+\)) 



(32b) 



+ bAt(r(n + \)v(n + \)) 
(r(n + \)v{n + \)) = (r(n)v(n + ±)> (32c) 
+ M ? (v 2 (« + i)>. 



Assuming that the MSV starts (and therefore remains) 
at 

(33) 



1 

/3m 



and that the particle begins at r(0) = 0, combining 
Eq. (32c), Eq. (27a), and Eq. (33) produces 



* w us bAt " : bAt 1 - a 
<Kn + l)v(» + ^)> = ^ga' = ^^- (34a) 

Mfa-a n+1 



(r(n)v(n+i)) 



/3m 1 — a 



(34b) 



Combining these results with Eq. (32a) and Eq. (32b) 
gives 



(r 2 (n + \)) = (r 2 (n)) + 



(bAt) 2 
f3m 



\-a n+l 
2-A 1 



l-a 



{bAt) 2 
/3m 

(bAt) 2 
f3m 



fn + 1 



l+a 2a(\ — a 



1 — a 



(l-a) 2 



(n + ^Y^ , «» 1 



(35a) 
(35b) 

(35c) 



Equating this expression to the (Fickian) MSD in the 
continuous limit 2Dt = 2?/(/3my) of a particle freely 
diffusing in one dimension during a total simulation 
time t = (n + \)At, leads to the effective time step 
rescaling of Eq. (15), 



^Diff 



(36) 



A.4 



Terminal drift under uniform ex- 
ternal force 



Requiring the continuous-limit terminal drift under 
a uniform external force fixes b in Eq. (7b) and 
Eq. (If). The terminal drift equals (r(n + 1) — 
r(n)) I At = b(v(n + \At)) once transients have died out 
and this value stops changing. The velocity reaches its 
terminal value when it (on average) doesn't change over 
a full step of the integrator, satisfied when 



(v(»-i)> = <v(»+i)> 



:a(v(n-\)) + (l+a) b -fL 



Thus the terminal drift is 

b 2 Atf 



bv te 



- cothl^ 

2 m 2 



(37) 
(38) 

(39) 



The value of b that equates this to the value for the con- 
tinuous equations of motion, // (my), is 



Vift = ^tanh^— 



'Diff • 



(40) 



Thus the time rescaling &Diff restoring the continuous- 
limit zero-force diffusion constant matches the time 
rescaling ^Drift restoring the continuous-limit terminal 
drift under a uniform external force, and thus applying 
a single time rescaling b = &Diff = ^Drift maintains the 
fluctuation-dissipation relation. 
The ORVRO, RVOVR, and VRORV integrators also 
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recover the continuous-limit terminal drift // (my) for 
the same bom, Eq. (40). VOROV and ROVOR inte- 
grators produce a terminal drift that differs from the 
continuous-limit value by a factor of sech(yA?/2). In 
particular, in the limit of large At the terminal drift goes 
to zero, and the dynamics are thus insensitive to force. 

A. 5 Mean-squared displacement, 
mean-squared velocity, and virial 
for linear force 

For a time-independent harmonic potential U = \kr 2 , 
the OVRVO integrator reduces to 



l-a 



v{n + \) = y^v(«) + J— <sK + (n 

v (n + \)=v(n + \)- — r(n) 
r (n+\) = r(n) + — v(n+\) 



r{n+\) = r{n+\) + b ^v(n+\) 
kbAt 



(41a) 

(41b) 
(41c) 
(41d) 



v{n+\)=v{n+\)-'^ m -r{n + \) (41e) 



l-a 



v(n + l) = Vav(n + l) + J^ <yf-(n+\) . (4 If) 

Ensemble averaging after multiplying Eq. (41a) in turn 
by v(n + l),v(n),r(n) produces 



(v 2 (n + - 4 )) = Mv(n)v(n + l)) + ^ (42a) 



l-a 
/3m 



{v{n)v{n + \)) = ^a~{v 2 {n)) (42b) 
(r(n)v(n+l)) = Va-(r(n)v(n)) (42c) 

This and similar ensemble averages of Eq. (41b) and 
Eq. (41f) produces a system of 18 linear equations for 
18 unknowns. Solving for the MSD and MSV, we find 
that: 



(r{n+\?) 
(r(n) 2 ) 



1 

1 1 



pki-i^bAty 

(v(») 2 > = (v(n + J) 2 > = (v(» + |) 2 > = -L 

1\2\ 



(v{n + \Y) 



Pml-l^bAty 



(43a) 
(43b) 

(43c) 
(43d) 



So the MSD at half-steps and the MSV at whole-steps 
match those of the continuous equations of motion. 



A similar calculation reveals that ORVRO also pro- 
duces continuous-limit MSD at half-steps and MSV at 
whole-steps. Conversely, RVOVR and VRORV pro- 
duce continuous-limit MSD at whole-steps and MSV 
at half-steps. 

Since for each of these four integrators the MSD 
and MSV attain their continuous-limit values at dif- 
ferent points, the virial at any given point in time al- 
ways differs from the continuous-limit value for each 
of these four splittings, with 0(At 2 ) error. VOROV and 
ROVOR integrators produce MSD with 0{At 2 ) error, 
and MSV with 0(At 4 ) error at timepoints n and n + \, 
respectively, and thus virials with at least 0(At 2 ) error. 

B Multiple dimensions 

This family of integrators is trivially extended to mul- 
tiple degrees of freedom. The position r, velocity v, 
and force / become vectors r, v, and f, respectively. 
The mass m becomes a diagonal matrix m. The normal 
variates ^V ± (n) become random vectors N ± («), with 
averages and covariance 

(N? («)> = 0, <^ P (m)^.») = SijS^Spa , 

(44) 

where p,a G {+,—}. The mass m and coefficients a 
and b become diagonal matrices m, a, and b, respec- 
tively, such that aij = 5,- 7 exp(— JiAt) and 



bij = Sij\ 



tanh 



(45) 



The resulting equations for OVRVO are: 

\{n + l) = a 1/2 -v(ra) (46a) 

/ , \ 1/2 

+ (^(I-a)-m- 1 J -N+(n) 
y{n + \) =v(n+J) + yb -m- 1 -^) (46b) 



At 



r(n+ i) = r(n) + —b-y{n+\ 

J?(n)^J>t?(n + l) 



(46c) 
(46d) 



At, 
At. 



r(n + l) =r(n + \) + — b -v(n+\) (46e) 



y(n+l) =y(n + \) + — b -voT 1 -t{n+\) (46f) 

(46g) 



v(« + l)=a 1 / 2 -v(« + |; 



1 \ 1/2 
( ^(I-aJ-m-^ -N-(n+l), 



for identity matrix I. 
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C Metropolization 

The shadow work (resulting from the finite time step 
of discrete Langevin integrators) drives the system out 
of equilibrium. 6 Metropolization is a popular method 
to recover the equilibrium distribution (though at the 
cost of unphysical dynamics). For example, in gen- 
eralized hybrid Monte Carlo, 12,43 the shadow work is 
used in a Metropolis acceptance step. In order to main- 
tain a reasonable acceptance rate the time step At must 
scale with the dimension d as d~ l l A , if the different 
degrees of freedom are uncorrected. 44 By similar ar- 
guments, when omitting the Metropolis step, achiev- 
ing statistically robust work-reweighted expectations 
(or minimizing departures from the desired distribution 
by keeping the shadow work small in magnitude) also 
requires scaling the time step At as d~ l l A . As a result, 
the number of time steps (and hence force evaluations) 
required to simulate a given time interval to a given ac- 
curacy scales as J 1 / 4 . 
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